AITopics | Western Region

2512.06341

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Chimpanzees' brutal battle for territory leads to a baby boom

Chimpanzees' brutal battle for territory leads to a baby boom A rival chimp can die in less than 15 minutes during these deadly territorial fights. New research led by UCLA and the University of Michigan has shown that chimp communities that kill their neighbors to gain territory also gain reproductive advantages. Breakthroughs, discoveries, and DIY tips sent every weekday. Uganda's Ngogo chimpanzees are well known for their "chimpanzee warfare." Primatologists have observed their brutal, lethal fights between 10 or more chimpanzees for decades, deciphering what leads to such violence.

artificial intelligence, chimpanzee, territory, (16 more...)

Popular Science

Country:

North America > United States > Michigan (0.25)
North America > United States > California > Los Angeles County > Los Angeles (0.15)
North America > United States > New Jersey (0.05)
(4 more...)

Genre: Research Report > New Finding (0.69)

Industry: Health & Medicine > Therapeutic Area (0.31)

Technology: Information Technology > Artificial Intelligence (0.50)

Variational Geometric Information Bottleneck: Learning the Shape of Understanding

arXiv.org Artificial IntelligenceNov-5-2025

We propose a unified information-geometric framework that formalizes understanding in learning as a trade-off between informativeness and geometric simplicity. An encoder ϕ is evaluated by U(ϕ): = I(ϕ(X);Y) βC(ϕ), where C(ϕ) penalizes curvature and intrinsic dimensionality, enforcing smooth, low-complexity manifolds. Under mild manifold and regularity assumptions, we derive non-asymptotic bounds showing that generalization error scales with intrinsic dimension while curvature controls approximation stability, directly linking geometry to sample efficiency. To operationalize this theory, we introduce the Varia-tional Geometric Information Bottleneck (V-GIB); a varia-tional estimator that unifies mutual-information compression and curvature regularization through tractable geometric proxies (Hutchinson trace, Jacobian norms, and local PCA). Experiments across synthetic manifolds, few-shot settings, and real-world datasets (Fashion-MNIST, CIFAR-10) reveal a robust information-geometry Pareto frontier, stable estimators, and substantial gains in interpretive efficiency. Notably, fractional-data experiments on CIFAR-10 confirm that curvature-aware encoders maintain predictive power under data scarcity, validating the predicted efficiency-curvature law. Overall, V-GIB provides a principled and measurable route to representations that are geometrically coherent, data-efficient, and aligned with human-understandable structure. Keywords: geometry of understanding; information bottleneck; curvature regularization; few-shot learning; mutual information; Hutchinson trace estimator; inter-pretability; human-machine alignment.

artificial intelligence, efficiency, machine learning, (17 more...)

2511.02496

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Jersey > Hudson County > Hoboken (0.04)
Africa > Uganda > Western Region > Kabale District (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Non-Asymptotic Stability and Consistency Guarantees for Physics-Informed Neural Networks via Coercive Operator Analysis

arXiv.org Artificial IntelligenceSep-4-2025

We present a unified theoretical framework for analyzing the stability and consistency of Physics-Informed Neural Networks (PINNs), grounded in operator coercivity, variational formulations, and non-asymptotic perturbation theory. PINNs approximate solutions to partial differential equations (PDEs) by minimizing residual losses over sampled collocation and boundary points. We formalize both operator-level and variational notions of consistency, proving that residual minimization in Sobolev norms leads to convergence in energy and uniform norms under mild regularity. Deterministic stability bounds quantify how bounded perturbations to the network outputs propagate through the full composite loss, while probabilistic concentration results via McDiarmid's inequality yield sample complexity guarantees for residual-based generalization. A unified generalization bound links residual consistency, projection error, and perturbation sensitivity. Empirical results on elliptic, parabolic, and nonlinear PDEs confirm the predictive accuracy of our theoretical bounds across regimes. The framework identifies key structural principles, such as operator coercivity, activation smoothness, and sampling admissibility, that underlie robust and generalizable PINN training, offering principled guidance for the design and analysis of PDE-informed learning systems.

artificial intelligence, convergence, machine learning, (17 more...)

2506.13554

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Africa > Uganda > Western Region > Kabale District (0.04)
Africa > Uganda > Central Region > Kampala (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Nagori, Aditya, Gautam, Ayush, Wiens, Matthew O., Nguyen, Vuong, Mugisha, Nathan Kenya, Kabakyenga, Jerome, Kissoon, Niranjan, Ansermino, John Mark, Kamaleswaran, Rishikesan

Contextual Phenotyping of Pediatric Sepsis Cohort Using Large Language Models

arXiv.org Artificial IntelligenceAug-5-2025

Clustering patient subgroups is essential for personalized care and efficient resource use. Traditional clustering methods struggle with high-dimensional, heterogeneous healthcare data and lack contextual understanding. This study evaluates Large Language Model (LLM) based clustering against classical methods using a pediatric sepsis dataset from a low-income country (LIC), containing 2,686 records with 28 numerical and 119 categorical variables. Patient records were serialized into text with and without a clustering objective. Embeddings were generated using quantized LLAMA 3.1 8B, DeepSeek-R1-Distill-Llama-8B with low-rank adaptation(LoRA), and Stella-En-400M-V5 models. K-means clustering was applied to these embeddings. Classical comparisons included K-Medoids clustering on UMAP and FAMD-reduced mixed data. Silhouette scores and statistical tests evaluated cluster quality and distinctiveness. Stella-En-400M-V5 achieved the highest Silhouette Score (0.86). LLAMA 3.1 8B with the clustering objective performed better with higher number of clusters, identifying subgroups with distinct nutritional, clinical, and socioeconomic profiles. LLM-based methods outperformed classical techniques by capturing richer context and prioritizing key features. These results highlight potential of LLMs for contextual phenotyping and informed decision-making in resource-limited settings.

large language model, machine learning, natural language, (20 more...)

2505.09805

Country:

North America > Canada > British Columbia > Vancouver (0.05)
Africa > Uganda > Western Region > Mbarara District (0.05)
North America > United States > North Carolina > Durham County > Durham (0.04)
(8 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Pediatrics/Neonatology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Causal Operator Discovery in Partial Differential Equations via Counterfactual Physics-Informed Neural Networks

arXiv.org Artificial IntelligenceJun-26-2025

We develop a principled framework for discovering causal structure in partial differential equations (PDEs) using physics-informed neural networks and counterfactual perturbations. Unlike classical residual minimization or sparse regression methods, our approach quantifies operator-level necessity through functional interventions on the governing dynamics. We introduce causal sensitivity indices and structural deviation metrics to assess the influence of candidate differential operators within neural surrogates. Theoretically, we prove exact recovery of the causal operator support under restricted isometry or mutual coherence conditions, with residual bounds guaranteeing identifiability. Empirically, we validate the framework on both synthetic and real-world datasets across climate dynamics, tumor diffusion, and ocean flows. Our method consistently recovers governing operators even under noise, redundancy, and data scarcity, outperforming standard PINNs and DeepONets in structural fidelity. This work positions causal PDE discovery as a tractable and interpretable inference task grounded in structural causal models and variational residual analysis.

artificial intelligence, machine learning, operator, (17 more...)

2506.20181

Country:

Africa > Mozambique (0.04)
Indian Ocean > Somali Basin > Mozambique Channel (0.04)
Africa > East Africa (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Structured Variational $D$-Decomposition for Accurate and Stable Low-Rank Approximation

arXiv.org Artificial IntelligenceJun-11-2025

We introduce the $D$-decomposition, a non-orthogonal matrix factorization of the form $A \approx P D Q$, where $P \in \mathbb{R}^{n \times k}$, $D \in \mathbb{R}^{k \times k}$, and $Q \in \mathbb{R}^{k \times n}$. The decomposition is defined variationally by minimizing a regularized Frobenius loss, allowing control over rank, sparsity, and conditioning. Unlike algebraic factorizations such as LU or SVD, it is computed by alternating minimization. We establish existence and perturbation stability of the solution and show that each update has complexity $\mathcal{O}(n^2k)$. Benchmarks against truncated SVD, CUR, and nonnegative matrix factorization show improved reconstruction accuracy on MovieLens, MNIST, Olivetti Faces, and gene expression matrices, particularly under sparsity and noise.

artificial intelligence, decomposition, machine learning, (17 more...)

2506.08535

Country:

North America > United States > Maryland > Baltimore (0.04)
Africa > Uganda > Western Region > Kabale District (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Kimera, Richard, Heo, Dongnyeong, Rim, Daniela N., Choi, Heeyoul

Data Augmentation With Back translation for Low Resource languages: A case of English and Luganda

arXiv.org Artificial IntelligenceMay-6-2025

In this paper,we explore the application of Back translation (BT) as a semi-supervised technique to enhance Neural Machine Translation(NMT) models for the English-Luganda language pair, specifically addressing the challenges faced by low-resource languages. The purpose of our study is to demonstrate how BT can mitigate the scarcity of bilingual data by generating synthetic data from monolingual corpora. Our methodology involves developing custom NMT models using both publicly available and web-crawled data, and applying Iterative and Incremental Back translation techniques. We strategically select datasets for incremental back translation across multiple small datasets, which is a novel element of our approach. The results of our study show significant improvements, with translation performance for the English-Luganda pair exceeding previous benchmarks by more than 10 BLEU score units across all translation directions. Additionally, our evaluation incorporates comprehensive assessment metrics such as SacreBLEU, ChrF2, and TER, providing a nuanced understanding of translation quality. The conclusion drawn from our research confirms the efficacy of BT when strategically curated datasets are utilized, establishing new performance benchmarks and demonstrating the potential of BT in enhancing NMT models for low-resource languages.

artificial intelligence, machine translation, natural language, (15 more...)

doi: 10.1145/3711542.3711594

2505.02463

Country:

Asia > Japan > Honshū > Chūgoku > Okayama Prefecture > Okayama (0.06)
Asia > South Korea (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre: Research Report (0.50)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Musinguzi, Denis, Katumba, Andrew, Murindanyi, Sudi

PaliGemma-CXR: A Multi-task Multimodal Model for TB Chest X-ray Interpretation

arXiv.org Artificial IntelligenceFeb-28-2025

Tuberculosis (TB) is a infectious global health challenge. Chest X-rays are a standard method for TB screening, yet many countries face a critical shortage of radiologists capable of interpreting these images. Machine learning offers an alternative, as it can automate tasks such as disease diagnosis, and report generation. However, traditional approaches rely on task-specific models, which cannot utilize the interdependence between tasks. Building a multi-task model capable of performing multiple tasks poses additional challenges such as scarcity of multimodal data, dataset imbalance, and negative transfer. To address these challenges, we propose PaliGemma-CXR, a multi-task multimodal model capable of performing TB diagnosis, object detection, segmentation, report generation, and VQA. Starting with a dataset of chest X-ray images annotated with TB diagnosis labels and segmentation masks, we curated a multimodal dataset to support additional tasks. By finetuning PaliGemma on this dataset and sampling data using ratios of the inverse of the size of task datasets, we achieved the following results across all tasks: 90.32% accuracy on TB diagnosis and 98.95% on close-ended VQA, 41.3 BLEU score on report generation, and a mAP of 19.4 and 16.0 on object detection and segmentation, respectively. These results demonstrate that PaliGemma-CXR effectively leverages the interdependence between multiple image interpretation tasks to enhance performance.

dataset, paligemma-cxr, segmentation mask, (12 more...)

2503.00171

Country:

Africa > Uganda > Western Region > Mbarara District (0.04)
Africa > South Africa (0.04)

Genre: Research Report > New Finding (0.66)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Tec, Mauricio, Xiong, Guojun, Wang, Haichuan, Dominici, Francesca, Tambe, Milind

Rule-Bottleneck Reinforcement Learning: Joint Explanation and Decision Optimization for Resource Allocation with Language Agents

arXiv.org Artificial IntelligenceFeb-15-2025

Deep Reinforcement Learning (RL) is remarkably effective in addressing sequential resource allocation problems in domains such as healthcare, public policy, and resource management. However, deep RL policies often lack transparency and adaptability, challenging their deployment alongside human decision-makers. In contrast, Language Agents, powered by large language models (LLMs), provide human-understandable reasoning but may struggle with effective decision making. To bridge this gap, we propose Rule-Bottleneck Reinforcement Learning (RBRL), a novel framework that jointly optimizes decision and explanations. At each step, RBRL generates candidate rules with an LLM, selects among them using an attention-based RL policy, and determines the environment action with an explanation via chain-of-thought reasoning. The RL rule selection is optimized using the environment rewards and an explainability metric judged by the LLM. Evaluations in real-world scenarios highlight RBRL's competitive performance with deep RL and efficiency gains over LLM fine-tuning. A survey further confirms the enhanced quality of its explanations.

explanation, large language model, machine learning, (13 more...)

2502.10732

Country:

Europe > Portugal > Braga > Braga (0.04)
Asia > Middle East > Israel (0.04)
Africa > Uganda > Western Region > Mbarara District (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.67)

Industry:

Law (0.87)
Government (0.87)
Health & Medicine > Therapeutic Area (0.67)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)